Pipeline, March/April 1996, vol.7, no.2
Copyright © 1996 Silicon Graphics


Restarting Sendmail


This article helps system administrators recognize when sendmail(1M) has
encountered a problem and is no longer processing mail. It also explains how
to stop the mail system correctly, recover messages queued to be sent, and
restart sendmail. It does not cover configuring or debugging the mail system.

Every Silicon Graphics system comes with an electronic mail system as a standard part of the IRIX operating system. There are many programs that make up the mail system. Each of these programs has specific functions and responsibilities. Some programs, called mail user agents (MUAs), help a person compose and read mail messages (e.g., zmail(1), mail_bsd(1), mail_att(1)). Other programs, called mail transfer agents (MTAs), deliver messages once they are composed. One such MTA program is the sendmail program.

The sole function of the sendmail program is to deliver messages to one or more recipients. The recipients may be on the local system or on one or more remote systems reachable via networks. The network can be an on-demand connection such as a modem SLIP link, or it could be a dedicated Ethernet network. Sendmail does not care about the network used, and is only concerned with whether there is any connectivity to the remote systems.

Sendmail examines the recipient's address in the mail message to determine if the message is to be delivered locally on the same system, or remotely to another system. If the delivery is local, sendmail places the mail in the local user's mailbox, /var/mail/<username>. If the recipient is on a remote system, sendmail routes the message through other systems as necessary to deliver the message to the final destination system and recipient.

Unfortunately, under certain circumstances, sendmail may no longer be able to process incoming or outgoing messages. The remainder of this article will discuss how to recognize and recover from this situation.

Determine if Sendmail is Processing

During normal operation, sendmail runs as a daemon, accepting incoming mail messages from other systems and sending messages destined for other users on other systems. To accommodate the simultaneous incoming and outgoing activity, sendmail forks copies of itself, known as child processes.

To determine the number of sendmail processes running at any particular time, execute the following command:

% ps -ef | grep sendmail | grep -v grep

root  23  17  0  10:55:57  ?  0:00  /usr/lib/sendmail  -bd  -q15m  -T6d 
root  26  17  1  10:55:58  ?  0:00  /usr/lib/sendmail  -bd  -q15m  -T6d 
root  18  17  0  10:55:55  ?  0:00  /usr/lib/sendmail  -bd  -q15m  -T6d 
root  17  1   0  10:55:55  ?  0:00  /usr/lib/sendmail  -bd  -q15m  -T6d 
root  28  26  2  10:55:58  ?  0:00  /usr/lib/sendmail  -bd  -q15m  -T6d

In this particular case, five sendmail processes are running. The number of sendmail processes running at any one time varies with the number of outgoing and incoming messages, but there should always be at least one.
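The process count can also be obtained directly rather than by reading the listing. The following is a minimal sketch, assuming a POSIX-style shell; count_mta is a hypothetical helper name, not an IRIX command. It reads ps -ef output on standard input and prints the number of lines that mention sendmail.

```shell
#!/bin/sh
# Sketch only: count sendmail processes in "ps -ef" style output read
# from standard input. count_mta is a hypothetical helper name.
count_mta() {
    # The [s] bracket expression matches the text "sendmail" but not the
    # literal string "[s]endmail", so grep's own entry in the process
    # list is not counted and no "grep -v grep" stage is needed.
    grep -c '[s]endmail'
}

# Example call (left commented out so that loading this file has no
# side effects):
#   ps -ef | count_mta
```

A result of 0 suggests that no sendmail daemon is running, while a result in the dozens or hundreds matches the runaway-fork symptom described later in this article.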

Mailq(1M) is another useful program for checking whether sendmail is running and processing mail messages. Mailq prints the current contents of the mail queue, that is, the messages currently queued for delivery by sendmail.

When a Problem Occurs

Under normal circumstances, the sendmail program is very robust. However, it is also very complex, and difficulties can arise in configuring and maintaining sendmail. When configuration errors occur, sendmail may be unable to process mail. Correcting these errors and configuring sendmail is outside the scope of this article. Interested readers should examine the "References" section at the end of this article for information on configuring and troubleshooting sendmail.

Symptoms

When sendmail stops normal processing of incoming and outgoing mail, several possible symptoms may be evident.

One of the more common symptoms is that when sendmail programs from other systems attempt to contact the local system, they may be unable to deliver the mail and may report an error message (sometimes called a "bounce") back to the message sender. Local users may find that they are unable to send mail at all and programs like zmail may return errors.

Another symptom is that attempts to run commands on the system result in the error message No more processes. This message occurs because the normal action of forking copies of sendmail has resulted in a large number of sendmail programs that have taken over all available process slots on the system. When this happens, if it is possible to run the following command, the resulting list of sendmail processes may be several dozen to hundreds long.

% ps -ef | grep sendmail

In many cases, this command will not run at all because there aren't enough process slots available on the system.

Lastly, when the sendmail program has stopped delivering messages, the mailq program will either take a very long time to display the mail queue (if the queue is very long, only the first 1000 messages are displayed), or it may not display at all.

Stopping the Mail System

The first step in restoring the mail system back to normal operation is to stop the current mail system in an orderly fashion. To complete this task, execute the following steps:

  1. The following steps require superuser privileges. Either log in as root, or become the root user on the system:
    	% /bin/su - 
    	Password: <enter password> 
    
  2. Stop the mail system.
    	# /etc/init.d/mail stop
    
  3. Since the mail system was in an abnormal condition when it was stopped, it is possible that some of the sendmail processes were not terminated. To ensure all sendmail programs terminate, issue the following command.
    	# /etc/killall -9 sendmail
    

Remove Empty Messages

Messages are stored in the directory /var/spool/mqueue. Zero length messages are either messages that were delivered just before the sendmail program stopped normal processing, or messages that were corrupted upon receipt. Either way, they can be removed with the following command.

# /bin/find /var/spool/mqueue -size 0 -exec /bin/rm {} \;
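Because the removal is irreversible, it may be worth listing the zero-length files before deleting them. The following is a minimal sketch, assuming a POSIX-style shell and find; list_empty is a hypothetical helper name, and the default directory is the queue directory named above.

```shell
#!/bin/sh
# Sketch only: print the zero-length regular files in a queue directory
# without removing anything. list_empty is a hypothetical helper name.
list_empty() {
    # -type f restricts the match to regular files; -size 0 matches
    # only files that contain no data.
    find "${1:-/var/spool/mqueue}" -type f -size 0 -print
}

# Example call (left commented out):
#   list_empty /var/spool/mqueue
```

If the list looks reasonable, the find command shown above, with -exec /bin/rm, removes the same files.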

Restart the Mail System

Restarting the mail system consists of starting the initial sendmail process. Once the first sendmail program is started, additional sendmail processes will be forked as needed to deliver or receive mail. Use the following command to restart the mail system.

# /etc/init.d/mail start 

Mailer daemons: sendmail.

At this point, the parent sendmail process starts, notices that there are mail messages to be sent and received, and forks additional sendmail processes to deliver and receive mail. Normal mail operation can be verified using the information in the section titled "Determine if Sendmail is Processing". If a large number of files existed in the directory /var/spool/mqueue, this number should begin to decrease as sendmail delivers these messages. A test message can also be sent to confirm proper sendmail operation.
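One way to watch the queue drain is to count the qf control files, one per queued message (the qf and df naming is described in the section titled "Manually Sending Undelivered Mail"). The following is a minimal sketch, assuming a POSIX-style shell; queue_count is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: print the number of queued messages in a directory by
# counting its qf control files. queue_count is a hypothetical helper.
queue_count() {
    # ls errors (for example, a missing directory) are discarded, in
    # which case the count printed is 0.
    ls "${1:-/var/spool/mqueue}" 2>/dev/null | grep -c '^qf'
}

# Example call (left commented out): sample the count once a minute.
#   while :; do queue_count; sleep 60; done
```

A count that falls over successive samples suggests that sendmail is delivering the backlog. A count that stays flat or grows suggests that the problem has returned.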

Clear the Mail Queue Entirely

Sometimes the steps outlined above restart the mail system only temporarily, and it may stop processing a short time later. If this happens, it is very possible that the message files contained in the directory /var/spool/mqueue are corrupt, and are causing sendmail to malfunction.

In this case, the easiest solution is to clear the directory of all pending messages and restart the mail system with an empty queue. This may be done by executing the steps below.

  1. The following steps require superuser privileges. Either log in as root, or become the root user on the system:
    	% /bin/su - 
    	Password: <enter password> 
    
  2. Stop the mail system.
    	# /etc/init.d/mail stop
    
  3. Since the mail system was in an abnormal condition when it was stopped, it is possible that some of the sendmail processes were not terminated. To ensure all sendmail programs terminate, issue the following command.
    	# /etc/killall -9 sendmail
    
  4. Determine the date and time of the oldest mail message that is still waiting to be delivered.
    	# ls -lt /var/spool/mqueue/df* | tail -1
    
    	-rw------- 1 root sys 26390 Jan 17 08:23 dfIAA26896
    
    In this instance, the date and time of the oldest mail message waiting to be delivered is 8:23AM on January 17th. As a courtesy to the users of the system, the system administrator should inform users that mail before this date was probably delivered without any problem, and that mail sent between that date and time, and the date and time of this problem notification may be delayed or lost. (The next section will attempt to salvage and restore pending mail messages.)

    It is not possible to be sure exactly when mail ceased to be delivered. A user sending a message from their mail user agent at 8:22AM can't be sure that it got into the mail queue by 8:23AM, especially if there are internal mail hubs involved in mail processing.

  5. Move all of the current mail messages in the /var/spool/mqueue directory to another directory so that they can be sent manually at a later date (refer to the section titled "Manually Sending Undelivered Mail").
    	# mkdir /var/spool/mqueue.old 
    
    	# mv /var/spool/mqueue/*  /var/spool/mqueue.old
    
  6. Restart the mail system.
    	# /etc/init.d/mail start 
    

Mailer daemons: sendmail.

At this point, the mail system will not have any mail messages to deliver. The mailq command will report Mail queue is empty or it will list only a few messages which may have entered the mail system since it was restarted. Either condition would indicate that the mail system, and sendmail in particular, is processing mail normally. A test message can also be sent to confirm proper sendmail operation.
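Step 5 above moves the queue with a single mv command and a * wildcard. On a badly backed-up queue, the expanded file list may be longer than the system allows for one command, in which case mv fails with an argument-list error. The following is a minimal sketch of an alternative that moves one file at a time, assuming a POSIX-style shell and find, and assuming the queue directory has no subdirectories; move_queue is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: move every regular file from a source directory to a
# destination directory, one file per mv invocation.
# move_queue is a hypothetical helper name.
move_queue() {
    src=$1
    dest=$2
    # Create the destination if needed; give up if that fails.
    mkdir -p "$dest" || return 1
    # find hands mv a single file at a time, so the shell never has to
    # expand a long "*" argument list.
    find "$src" -type f -exec mv {} "$dest" \;
}

# Example call (left commented out):
#   move_queue /var/spool/mqueue /var/spool/mqueue.old
```

This is slower than a single mv because it starts one process per file, but it does not depend on the size of the queue.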

Manually Sending Undelivered Mail

In the steps above, efforts were made to save undelivered mail to a recovery directory. Recovery of mail messages from this directory is a manual and tedious operation but it does offer a method for recovery of critical messages.

To recover these messages, it is first necessary to find the files that make up a mail message. Each mail message is composed of a qf and a df file, where the filenames are of the format qf$$$##### and df$$$##### (where $ represents a letter, and # represents a number). Related qf and df files have the same letters and digits; only the first letter (q or d) differs.

The qf file contains mail header information including items such as the sender, recipient and subject. The df file contains the text of the mail message. If the system administrator can find matching qf and df files, the two files can be pieced together and sent on to the recipient.
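Matching the files by eye is tedious when the recovery directory is large. The following is a minimal sketch, assuming a POSIX-style shell and the naming scheme described above; list_pairs is a hypothetical helper name. For each message identifier, it reports whether both files are present, only the qf file, or only the df file.

```shell
#!/bin/sh
# Sketch only: classify the messages in a recovery directory by which
# of their two files are present. list_pairs is a hypothetical helper.
list_pairs() {
    dir=${1:-/var/spool/mqueue.old}
    for qf in "$dir"/qf*; do
        [ -f "$qf" ] || continue      # skip when the pattern matched nothing
        id=${qf##*/qf}                # strip the directory and the "qf" prefix
        if [ -f "$dir/df$id" ]; then
            echo "pair $id"
        else
            echo "qf-only $id"
        fi
    done
    for df in "$dir"/df*; do
        [ -f "$df" ] || continue
        id=${df##*/df}
        # Report df files that have no matching qf file.
        [ -f "$dir/qf$id" ] || echo "df-only $id"
    done
}

# Example call (left commented out):
#   list_pairs /var/spool/mqueue.old
```

Messages reported as "pair" can be reassembled as described in the rest of this section. The other two cases are discussed at the end of the section.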

The recipient can be identified by examining the qf file for a line that begins with the letter R. For example, if the file qfAAD00312 had a line that said:

R<joecoder@mudpit.org>

This would indicate that the message contained in the file dfAAD00312 was intended for joecoder@mudpit.org. The system administrator can now reassemble and mail the message using the following command, which concatenates the two files and passes the result to mail on its standard input:

# cat qfAAD00312 dfAAD00312 | mail joecoder@mudpit.org

Note that because the system administrator is reassembling and sending this mail, it will appear to joecoder@mudpit.org that it was sent from the system administrator, and not from the original sender.

If only the qf file can be found, the system administrator can search the file for a line that starts with F. This is the from line and indicates the sender.

F<plainjane@skyzone.com>

If desired, the system administrator can send mail to the sender stating that the mail was lost.
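The R and F lines described above can be pulled out of a qf file without opening it in an editor. The following is a minimal sketch, assuming a POSIX-style shell and sed, and assuming the line codes are exactly as given in this article; qf_lines is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: print the contents of every line in a qf file that
# begins with a given one-letter code, with the code itself removed.
# qf_lines is a hypothetical helper name.
#   $1 = line code, for example R (recipient) or F (from)
#   $2 = path to the qf file
qf_lines() {
    # -n suppresses normal output; the p flag prints only the lines on
    # which the leading code letter was found and stripped.
    sed -n "s/^$1//p" "$2"
}

# Example calls (left commented out):
#   qf_lines R qfAAD00312
#   qf_lines F qfAAD00312
```

A message addressed to several recipients has several R lines, so the first call may print more than one address.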

For df files that do not have matching qf files, it may be possible to examine the body of the message in order to determine the recipient or sender. This is much more tedious, and potentially invasive.

Conclusion

The mail system is generally well behaved and easy to maintain and administer. However, as with any complex system, problems can occur. This article has presented information to assist system administrators in stopping and starting the mail system in an orderly fashion, and in clearing backed up mail messages.

References

The following documents provide additional information on mail and sendmail. The presence or absence of any particular reference should not be construed as a comment on its usefulness.

Chapter 8, titled "IRIX Sendmail", in the IRIX Admin: Networking and Mail Manual, available on-line with InSight(1).

Sendmail by Bryan Costales, Eric Allman, and Neil Rickert, O'Reilly and Associates, Inc., Publisher (ISBN 1-56592-056-2)

Sendmail Theory and Practice by Fredrick M. Avolio and Paul Vixie, Digital Press (ISBN 1-55558-127-7)